1,009 research outputs found

    Optimization-based modeling of Lombard speech articulation: Supraglottal characteristics

    This paper shows that a highly simplified model of speech production, based on optimizing articulatory effort against intelligibility, can account for some observed articulatory consequences of signal-to-noise ratio. Simulations of static vowels in the presence of various background noise levels show that the model predicts articulatory and acoustic modifications of the type observed in Lombard speech. These features were obtained only when the constraint applied to articulatory effort decreased as the level of background noise increased. These results support the hypothesis that Lombard speech is listener oriented and that speakers adapt their articulation in noisy environments.

    Optimal control of speech with context-dependent articulatory targets

    This paper presents a computational implementation of phonetic planning, which consists of choosing the position of articulatory targets that satisfy conflicting linguistic and extra-linguistic requirements. We present a minimal model that considers intelligibility and least effort as task requirements. To achieve the context-dependent variability of targets, our model approximates intelligibility as a function of target phoneme recognition probability given a vector of articulatory parameters. Preliminary experiments show that our minimal computational model of phonetic planning is able to predict two types of hypoarticulation by adjusting the weight assigned to effort: vowel centralization and stop consonant lenition.
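
The effort-versus-intelligibility trade-off described in the abstract above can be illustrated with a toy sketch. This is not the authors' model: the one-dimensional articulatory space, the two Gaussian phoneme categories, the quadratic effort term, and the names `recognition_probability` and `planned_target` are all assumptions made for illustration. The planned target minimizes a weighted effort term minus the log probability that the target phoneme is recognized.

```python
import math


def recognition_probability(x, target=1.0, competitor=-1.0, sigma=0.5):
    """P(target phoneme | x) for two equally likely Gaussian categories.

    x is a hypothetical 1-D articulatory parameter; 0 is the neutral position.
    """
    like_target = math.exp(-((x - target) ** 2) / (2 * sigma ** 2))
    like_competitor = math.exp(-((x - competitor) ** 2) / (2 * sigma ** 2))
    return like_target / (like_target + like_competitor)


def planned_target(effort_weight, steps=2001):
    """Grid search on [0, 2] for the position minimizing effort minus log-intelligibility."""
    best_x, best_cost = 0.0, float("inf")
    for i in range(steps):
        x = 2.0 * i / (steps - 1)
        effort = x ** 2  # assumed effort: squared distance from neutral
        cost = effort_weight * effort - math.log(recognition_probability(x))
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x


if __name__ == "__main__":
    for w in (0.1, 1.0, 10.0):
        print(f"effort weight {w:5.1f} -> planned target {planned_target(w):.3f}")
```

In this sketch, raising the effort weight pulls the planned target toward the neutral position, a crude analogue of vowel centralization. Lowering it pushes the target away from the competing category, which is the direction of change the Lombard speech abstract attributes to a relaxed effort constraint.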

    Copy synthesis of phrase-level utterances

    This paper presents a simulation framework for synthesizing speech from anatomically realistic data of the vocal tract. The acoustic propagation paradigm is chosen so that it can deal with complex geometries and a time-varying length of the vocal tract. The glottal source model designed in this paper allows partial closure of the glottis by branching a posterior chink in parallel to a classic lumped mass-spring model of the vocal folds. Temporal scenarios for the dynamic shapes of the vocal tract and the glottal configurations may be derived from the simultaneous acquisition of X-ray images and audio recording. Copy synthesis of a few French sentences shows the accuracy of the simulation framework in reproducing acoustic cues of natural phrase-level utterances containing most French natural classes while considering the real geometric shape of the speaker.

    Glottal Opening and Strategies of Production of Fricatives

    This work investigates the influence of the gradual opening of the glottis along its length during the production of fricatives in intervocalic contexts. Acoustic simulations reveal the existence of a transient zone in the articulatory space where the frication noise level is very sensitive to small perturbations of the glottal opening. This zone corresponds to configurations where both frication noise and voiced contributions are present in the speech signal. To avoid this instability, speakers may adopt different strategies to ensure the voiced/voiceless contrast of fricatives. This is evidenced by experimental data from simultaneous glottal opening measurements, performed with ePGG, and audio recordings of vowel-fricative-vowel pseudowords. Voiceless fricatives are usually longer, in order to maximize the number of voiceless time frames relative to the voiced frames that result from crossing the transient regime. For voiced fricatives, the speaker may avoid the unstable regime by keeping the frication noise level low, thus favoring the voicing characteristic, or by making very short crossings into the unstable regime. It is also shown that when speakers are asked to sustain voiced fricatives longer than in natural speech, they adopt the strategy of keeping the frication noise level low to avoid the unstable regime.

    A glottal chink model for the synthesis of voiced fricatives

    This paper presents a simulation framework that enables a glottal chink model to be integrated into a time-domain continuous speech synthesizer along with self-oscillating vocal folds. The glottis is then made up of two separate main components: a self-oscillating part and a constantly open chink. This feature allows the simulation of voiced fricatives: the self-oscillating model of the vocal folds generates the voiced source, and the chink provides the glottal opening that is necessary to generate the frication noise. Numerical simulations show the accuracy of the model in simulating voiced fricatives, as well as phonetic assimilation such as sonorization and devoicing. The simulation framework is also used to show that the phonatory/articulatory space for generating voiced fricatives differs according to the desired sound: for instance, the minimal glottal opening for generating frication noise is shorter for /z/ than for /Z/.

    Estimation of vocal tract length for acoustic-to-articulatory inversion

    The complex geometry of the vocal tract makes the acoustic-to-articulatory inversion problem difficult, notably because it is strongly ill-posed. Regularization is achieved by adding constraints, either articulatory ones (an articulatory model, which requires few parameters but must be adapted to each speaker) or constraints on the values of the area functions. In the latter case, the vocal tract length is generally fixed at some arbitrary value, which makes it impossible to analyze possible protrusions or lengthening/shortening of the pharynx. The study presented here proposes an approach for estimating the vocal tract length of any speaker from a recording of the speech signal. The method is an analysis-by-synthesis method that consists of recovering the area function that generates the formants estimated from the speaker's speech signal. It starts from an initial area function that is modified iteratively according to the sensitivity function method, following the theory developed by Fant and Pauli on perturbations of cross-sections inside the vocal tract. Existing work in the literature that uses this method, however, imposes a fixed length on the area functions, and consequently a fixed vocal tract length. Our approach solves this problem by also taking perturbations of the vocal tract length into account. A numerical and experimental study validates the technique for French oral vowels.
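
For orientation only, the link between formants and vocal tract length can be sketched with the textbook uniform-tube approximation. This is not the sensitivity-function method of the abstract above: it assumes a lossless tube of constant cross-section, closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1)c / (4L). The function name `uniform_tube_length_cm` and the speed-of-sound value are assumptions chosen for illustration.

```python
SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed value for warm, humid air


def uniform_tube_length_cm(formants_hz, c=SPEED_OF_SOUND_CM_PER_S):
    """Average the length implied by each formant of a uniform closed-open tube.

    formants_hz lists F1, F2, ... in order; each one gives the estimate
    L = (2n - 1) * c / (4 * F_n).
    """
    if not formants_hz:
        raise ValueError("at least one formant is required")
    estimates = [
        (2 * n - 1) * c / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Formants of an idealized neutral vowel.
    print(uniform_tube_length_cm([500.0, 1500.0, 2500.0]))
```

Real vowels depart from the uniform-tube pattern, so the per-formant estimates disagree with one another. An approach like the one in the abstract instead adjusts both the area function and the length until the synthesized formants match the measured ones.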